Search for: All records

Creators/Authors contains: "Gordon, Spencer L"


  1. Agrawal, Shipra; Roth, Aaron (Ed.)
    We consider the problem of \emph{identifying}, from statistics, a distribution of discrete random variables $$X_1, \ldots, X_n$$ that is a mixture of $$k$$ product distributions. The best previous sample complexity for $$n \in O(k)$$ was $$(1/\zeta)^{O(k^2 \log k)}$$ (under a mild separation assumption parameterized by $$\zeta$$). The best known lower bound was $$\exp(\Omega(k))$$. It is known that $$n \geq 2k-1$$ is necessary and sufficient for identification. We show, for any $$n \geq 2k-1$$, how to achieve sample complexity and run-time complexity $$(1/\zeta)^{O(k)}$$. We also extend the known lower bound of $$\exp(\Omega(k))$$ to match our upper bound across a broad range of $$\zeta$$. Our results are obtained by combining (a) a classic method for robust tensor decomposition with (b) a novel way of bounding the condition number of key matrices, called Hadamard extensions, by studying their action only on flattened rank-1 tensors.
  2. Products of experts (PoE) are layered networks in which the value at each node is an AND (or product) of the values (possibly negated) at its inputs. These were introduced as a neural network architecture that can efficiently learn to generate high-dimensional data that satisfy many low-dimensional constraints---thereby allowing each individual expert to perform a simple task. PoEs have found a variety of applications in learning. We study the problem of identifiability of a product-of-experts model having a layer of binary latent variables and a layer of binary observables that are iid conditional on the latents. The previous best upper bound on the number of observables needed to identify the model was exponential in the number of parameters. We show: (a) when the latents are uniformly distributed, the model is identifiable with a number of observables equal to the number of parameters (and hence best possible); (b) in the more general case of arbitrarily distributed latents, the model is identifiable for a number of observables that is still linear in the number of parameters (and within a factor of two of best possible). The proofs rely on root interlacing phenomena for some special three-term recurrences.
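
The first record above concerns learning a mixture of $$k$$ product distributions over discrete variables $$X_1, \ldots, X_n$$ from observed statistics. As a minimal sketch of that setting (binary variables chosen for concreteness; the parameter and function names below are illustrative, not the paper's), the joint probability of an observed string is a sum of $$k$$ rank-1 product terms, which is the structure that tensor-decomposition methods exploit:

```python
import numpy as np

# Illustrative sketch only: a mixture of k product distributions over n binary
# variables X_1, ..., X_n (binary chosen for concreteness; names are not the paper's).
#   pi[j]   = mixing weight of component j
#   p[j, i] = Pr[X_i = 1 | component j]
# The joint probability factors as
#   Pr[x] = sum_j pi[j] * prod_i p[j, i]**x_i * (1 - p[j, i])**(1 - x_i),
# so the joint probability tensor has rank at most k.

rng = np.random.default_rng(0)

def sample_mixture(pi, p, num_samples):
    """Draw samples from a mixture of product distributions over {0,1}^n."""
    k, n = p.shape
    components = rng.choice(k, size=num_samples, p=pi)   # latent component for each sample
    coins = rng.random((num_samples, n))
    return (coins < p[components]).astype(int)           # coordinates independent given the component

def joint_probability(pi, p, x):
    """Exact Pr[X = x] under the mixture: a sum of k rank-1 (product) terms."""
    per_component = np.prod(np.where(x == 1, p, 1.0 - p), axis=1)
    return float(pi @ per_component)

# Tiny example with k = 2 and n = 3, so that n >= 2k - 1 (the identifiability threshold).
pi = np.array([0.6, 0.4])
p = np.array([[0.9, 0.2, 0.7],
              [0.1, 0.8, 0.3]])
samples = sample_mixture(pi, p, 100_000)
x = np.array([1, 0, 1])
print(np.mean(np.all(samples == x, axis=1)), joint_probability(pi, p, x))  # should roughly agree
```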
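For the second record, the sketch below spells out one natural reading of a product-of-experts layer with binary latents and observables that are iid given the latents: conditional on the latent configuration, each observable is the AND (product) of independent expert bits whose firing probabilities depend on the corresponding latent. The parameterization and the names (latent_dist, p_on, p_off, sample_poe) are assumptions of mine, not taken from the paper.

```python
import numpy as np

# Hypothetical parameterization (not necessarily the paper's):
#   - m binary latents H = (H_1, ..., H_m) with an arbitrary joint distribution latent_dist;
#   - conditional on H, the n binary observables are i.i.d., each the AND of m independent
#     expert bits, where expert i fires with probability p_on[i] if H_i = 1 and p_off[i]
#     if H_i = 0, so that Pr[X_j = 1 | H] = prod_i p_on[i]**H_i * p_off[i]**(1 - H_i).

rng = np.random.default_rng(1)

def sample_poe(latent_dist, p_on, p_off, n_observables, num_samples):
    """Sample (H, X) pairs from the product-of-experts layer sketched above."""
    configs = np.array(list(latent_dist.keys()))      # rows are latent configurations in {0,1}^m
    weights = np.array(list(latent_dist.values()))
    idx = rng.choice(len(configs), size=num_samples, p=weights)
    H = configs[idx]                                   # (num_samples, m)
    prob_one = np.prod(np.where(H == 1, p_on, p_off), axis=1)   # Pr[one observable = 1 | H]
    X = (rng.random((num_samples, n_observables)) < prob_one[:, None]).astype(int)
    return H, X

# Tiny example: m = 2 arbitrarily distributed latents, n = 5 observables.
latent_dist = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
p_on = np.array([0.9, 0.8])    # expert i fires with this probability when H_i = 1
p_off = np.array([0.3, 0.5])   # ... and with this probability when H_i = 0
H, X = sample_poe(latent_dist, p_on, p_off, n_observables=5, num_samples=50_000)
print(X.mean())                # empirical Pr[X_j = 1], marginalized over the latents
```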